DUAL ADAPTIVE CONTROL: DESIGN PRINCIPLES AND APPLICATIONS


Purusottam Mookerjee, Ph.D. 

The University of Connecticut, 1986 

The design of an actively adaptive "dual"
controller based on an approximation of the stochastic
dynamic programming equation for a multi-step horizon
is presented. A dual controller that can enhance
identification of the system while controlling it at
the same time is derived for multidimensional
problems. This dual controller uses sensitivity
functions of the expected future cost with respect to
the parameter uncertainties. A passively adaptive
"cautious" controller and the actively adaptive "dual"
controller are examined. In many instances, the
cautious controller is seen to turn off, while the
dual controller avoids both the turn-off of the control
and the slow convergence of the parameter estimates
characteristic of the cautious controller. The
algorithms have been applied to:

1) a multivariable static model which represents a 
simplified linear version of the relationship between 
the vibration output and the higher harmonic control 
input for a helicopter and 

2) a dynamic model that has similarity with an 
ore-crushing plant or a heat exchanger model. 

Monte Carlo comparisons based on parametric and 
nonparametric statistical analysis indicate the 
superiority of the dual controller over the baseline 
controller. 
Chapter 1


INTRODUCTION 

Research on adaptive control started in the
early fifties [A2]. The design of autopilots for
aircraft operating over a wide range of speeds and
altitudes motivated this research, since for such a
wide range of operating conditions the use of adaptive
control was deemed necessary. However, progress in
this field was quite slow because of the lack of
understanding of the inherently nonlinear adaptive
systems, and the first results began to appear only in
the sixties. During that period, pioneering research
toward understanding the theory of adaptive control
was conducted [B6, F1], and this laid the
foundations of the adaptive control research of today.
At the present time [A1] this research has gained a lot
of momentum because of: (1) the advent of digital
computers, and, in particular, microprocessors, and
(2) the successful applications of adaptive control in
the aircraft industry.

Most application areas of adaptive control can
be mathematically modeled by multivariable systems
with some or all parameters unknown. The control of
such systems cannot be handled by deterministic
control theory. The unknown parameters are modeled as
random variables and the unknown disturbances in the
system are modeled as stochastic processes; their
control constitutes the framework of stochastic
control theory. The use of the
Proportional-Integral-Derivative (PID) regulator for
the control of such industrial processes is appealing
for its simplicity. For an industrial process,
however, it is a colossal task to tune the large number
of control gains involved. Under these circumstances,
adaptive control is needed. Adaptive control
techniques handle industrial processes with
uncertain parameters by combining system
identification and control design. In the Bayesian
framework these controllers assume that the parameters
have prior probability density functions and a large
uncertainty associated with their initial estimates.
In the process of simultaneous system identification
and control, these controllers reduce the uncertainty
associated with the parameter estimates, i.e., they
learn and control the system. This is the
basic philosophy of adaptive control.

The design of a controller is the result of an
optimization algorithm on a performance index or cost
function. This index is generally defined as a
function of the system's actual output and its desired
output. For systems with uncertain parameters, the
control solution which optimizes over a multistage
horizon is obtained by solving the stochastic dynamic
programming equation [B5, and Eq. (10) of this report].
However, it is not possible to achieve an optimal
solution because of the dimensionality involved in the
stochastic dynamic programming. In such situations,
emphasis is put on obtaining a suboptimal solution
that incorporates the intrinsic properties of the
optimal solution. For stochastic systems, the control
has in general a dual effect [B1, F1]: it affects the
system's state as well as the future state and/or
parameter uncertainty. This property is shared by all
control policies, whether or not it is explicitly
incorporated in their design. Thus a control law which
explicitly utilizes this property in its design,
called a dual controller, offers significant
improvement potential for the control of uncertain
linear plants. In multistage problems it probes
the system to enhance real-time identification of the
system's parameters in order to increase the accuracy
of the subsequent control decisions and regulates the
system at the same time [B3, D1]. Thus the controller
has two different tasks, and the dual controller
compromises between good control and good
identification of the system.

Simpler controllers which do not account for
any dual effect are also investigated here. One of
them estimates the system's parameters based upon all
available information and uses those estimates as
though they were true. This is called the Heuristic
Certainty Equivalence (HCE) controller [B1]. It is
similar in form to the deterministic controller except
that it uses the parameter estimates in the derivation of
the control input. The other one, called the cautious
controller, uses the parameter estimates as well as
their associated current covariances. In an uncertain
situation, the latter can be overly "cautious" because
of the parameter uncertainty. Another problem of this
controller is the turn-off phenomenon, in which the
control almost vanishes over significant lengths of
time. The controller then cannot estimate the
system's parameters and loses control over the
system [A1].

Two classes of dual controllers exist
presently. In the first class [E1, G1, M1, M5, W1],
the control minimizes a one-step-ahead criterion
augmented by a second term which penalizes poor
identification. The approach is simple but does not
fully exploit the dual property and often requires
tuning of some parameters. Padilla and Cruz [P1] give
a dual control solution for a plant by minimizing the
control objective function subject to an upper bound
on the total estimation cost. Their objective
function includes a standard cost function and also a
constraint term which reflects the sensitivity of the
parameters to the state of the system. Thus the
solution adjusts itself to exercise better estimation
for such sensitive parameters within the upper bound.
The second class [B2, B4, S1, S2, T1] uses the
stochastic dynamic programming equation and expands
the future cost about a nominal trajectory. The
approach of this second class is different from that
discussed in [A1]. The method proposed in [A1]
formulates the stochastic dynamic programming equation
but suggests no expansion of the expected future cost
about any nominal trajectory. Thus no explicit
minimization is possible except at the last step, and the
expected cost is minimized for two steps by numerical
integration.

The recently developed linear feedback dual
controller of [B4] is based upon a first order Taylor
series expansion of the expected future cost and is
called the first order dual solution, D1. This
solution, D1, although simple, does not capture all
the dual effect available from the future expected
cost [M4]. A second order Taylor series expansion
handles it better and yields the second order dual
solution, D2, in [M2]. The D2 solution modifies the
cautious controller with a numerator "probing" term
and a denominator correction term. Performance
comparisons among the cautious, D1 and D2 solutions
are available in [M2] for a scalar model. Both
the cautious and the D1 solutions turn off but the D2
solution avoids turn-off, indicating that D1 is not a
satisfactory dual solution. In this dissertation, the
D2 solution is developed for a multivariable
input-output system in Chapter 2, and both the cautious
and the D2 solutions are applied to this system.
Monte Carlo simulations are made
which indicate that the D2 solution prevents the
turn-off phenomenon prevalent with a cautious
solution. However, there are a few occasions where it
demonstrates excessive probing; this is handled by a
control limiter. A second order Taylor series
expansion of the future expected cost is performed
about a nominal trajectory and a dual controller is
developed and applied to a MIMO dynamic (ARMA of
lag one) model in Chapter 3. Monte Carlo simulations
and parametric and nonparametric statistical tests of
significance indicate the superiority of the dual over
the cautious and the heuristic certainty equivalence
controllers.



Chapter 2 


Dual Control and Prevention of the Turn-Off 
Phenomenon in a Class of MIMO Systems 


2.1 INTRODUCTION 

In this chapter, a dual solution is developed
based on a second order expansion of the expected
future cost, and both the cautious and the D2 solutions
are applied to a multivariable input-output system.
Monte Carlo simulations are made which indicate that
the D2 solution prevents the turn-off phenomenon
prevalent with a cautious solution. However, there
are a few occasions where it demonstrates excessive
probing; this is handled by a control limiter. Monte
Carlo simulations and statistical tests of
significance indicate the superiority of the dual over
the cautious and the heuristic certainty equivalence
controllers.


Section 2 gives the problem formulation.
Section 3 discusses the turn-off phenomenon observed
in a stochastic environment. The approximate dual
controller for the multivariable input-output system
is provided in Section 4. Section 5 describes the
simulation of the plant and compares the performances
of the cautious, dual (D2) and the HCE solutions.
Section 6 concludes the chapter.



2.2 PROBLEM FORMULATION 


The multivariable plant considered is

$$ x(k+1) = c + B\,u(k) \qquad (1) $$

where c is an unknown vector and B is a matrix of
unknown parameters. This static model with constant
parameters represents a simplified helicopter
vibration control problem under steady flight
conditions [M4, W2] and defines a relationship between
the higher harmonic control input vector u and
the vector x of vibration output amplitudes. These
controls can cancel some of the unsteady air loads on
the blades. The unknown elements of c and B
comprise the parameter vector $\theta(k)$, whose estimate
at time k is $\hat{\theta}(k)$ with covariance matrix
P(k). Assuming the parameters are time-invariant,
we have

$$ \theta(k+1) = \theta(k) \qquad (2) $$
The measurement vector is given by

$$ y(k) = x(k) + w(k) \qquad (3) $$

where

$$ E[w(k)] = 0; \quad E[w(k)\,w'(j)] = W\,\delta_{kj} \qquad (4) $$

with x(k), y(k) being n-dimensional vectors.
The performance criterion to be minimized is the
expected value of the cost from step 0 to N,

$$ J(0) = E\{C(0)\} = E\Big\{ \sum_{k=1}^{N} x'(k)Qx(k) + u'(k-1)Ru(k-1) \,\Big|\, I^k \Big\} \qquad (5) $$

where N = 2 for the two-step horizon.
2.3 CAUTIOUS CONTROL AND THE TURN-OFF PHENOMENON

For the sake of illustration let us consider a
scalar plant with one unknown gain parameter,

$$ x(k+1) = c + b\,u(k) \qquad (6) $$

obeying (2)-(5).

The cautious controller, designed with a one-step
horizon (N = 1), is obtained by minimizing (5)
for the plant (6) with Q = 1 and R = 0, i.e.,

$$ \min_{u(0)} E\{x^2(1)\} \qquad (7) $$

This is given by

$$ u^c(0) = -\frac{\hat{b}(0)\,c}{\hat{b}^2(0) + P_b(0)} \qquad (8) $$

where $P_b(0)$ is the variance associated with the
estimate $\hat{b}(0)$.
In the case of a constant but unknown parameter,
the controller assumes initially that the parameter
has a prior probability density function with a large
uncertainty. The parameter uncertainty will evolve as
(9), and the controller tends to adapt itself to the
system and gradually learn the system with time.

From (8) it is clear that if $\hat{b}(0)$ is very
small compared to $P_b(0)$, the control $u^c(0)$
will also be very small. Moreover, if $u^c(0)$ is
small, there is no learning and the covariance stays
practically unchanged. When this situation occurs, it
persists until a large measurement noise
alters the parameter estimate and brings the
system out of turn-off. This often leads to a
burst phenomenon. The dual controllers presented
here and in [M2] have sensitivity correction terms
which are usually large in such situations and avoid
the turn-off phenomenon. The occurrence of the
turn-off phenomenon is well understood in the
context of a scalar model. It is further discussed
for a multidimensional system in Section 5.
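The turn-off mechanism described above can be illustrated numerically. The following is a minimal sketch, assuming the scalar plant (6) with known c, Q = 1 and R = 0; the function names and the numerical values are hypothetical, and the variance update is the scalar specialization of the Kalman covariance update (15)-(16) given in Section 4, not a formula quoted from this section.

```python
def cautious_control(c, b_hat, p_b):
    """One-step cautious control for x(1) = c + b*u(0) with known c,
    estimate b_hat of b and estimate variance p_b (Q = 1, R = 0)."""
    return -b_hat * c / (b_hat ** 2 + p_b)


def variance_after_update(p_b, u, w):
    """Scalar Kalman variance update for the measurement y = c + b*u + noise
    with noise variance w: P(1) = P(0) * w / (u^2 * P(0) + w)."""
    return p_b * w / (u ** 2 * p_b + w)


# Hypothetical values chosen only to show the effect.
c, w = 1.0, 1.0

# Accurate estimate: the control is close to the certainty-equivalence value -c/b_hat.
u_good = cautious_control(c, b_hat=1.0, p_b=0.01)

# Poor estimate: b_hat is small relative to p_b, so the control nearly vanishes.
u_off = cautious_control(c, b_hat=0.1, p_b=100.0)

# With such a small control the variance is left practically unchanged.
p_after = variance_after_update(100.0, u_off, w)

print(u_good, u_off, p_after)
```

Because the control enters the variance update only through its square, a near-zero control yields almost no reduction in uncertainty, which is why the turned-off state tends to persist.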
2.4 DUAL CONTROL WITH A TWO-STEP HORIZON

A dual control solution with a two-step horizon
is obtained by minimizing (5) with respect to the
control u(0) for the multidimensional plant (1)-(4).
This is obtained by solving the general equation of
Stochastic Dynamic Programming [B6, B7]

$$ J^*(k) = \min_{u(k)} E\{C(k) + J^*(k+1) \mid I^k\}, \quad k = N-1, \ldots, 1, 0 \qquad (10) $$

where $J^*(k)$ is the cost to go from k to
N and $I^k$ is the cumulated information at time k
when the control u(k) is to be applied.

For the two-step horizon (N = 2), Eq. (10) at k = 0 becomes

$$ J^*(0) = \min_{u(0)} E\{x'(1)Qx(1) + u'(0)Ru(0) + J^*(1) \mid I^0\} \qquad (11) $$
where $J^*(1)$ is the optimal cost at the last step
and is obtained by minimization of J(N-1) for N = 2.

The last control is easily obtained by
minimizing J(1) and is given by

$$ u^*(1) = -[R + E\{B'(1)QB(1) \mid I^1\}]^{-1} E\{B'(1)Qc(1) \mid I^1\} \qquad (12) $$

Thus inserting $u^*(1)$ into J(1) we obtain

$$ J^*(1) = E\{c'(1)Qc(1) \mid I^1\} - E\{c'(1)QB(1) \mid I^1\} [R + E\{B'(1)QB(1) \mid I^1\}]^{-1} E\{B'(1)Qc(1) \mid I^1\} \qquad (13) $$

where $E\{\cdot \mid I^1\}$ is the conditional expectation
given the available information $I^1$.

The parameter vector estimate $\hat{\theta}(1)$ and
the associated covariance matrix P(1) are obtained
from a Kalman filter according to
$$ \hat{\theta}(1) = \hat{\theta}(0) + K(1)[y(1) - H(1)\hat{\theta}(0)] = \hat{\theta}(0) + K(1)\,\nu(1) \qquad (14) $$

$$ K(1) = P(0)H'(1)[H(1)P(0)H'(1) + W]^{-1} \qquad (15) $$

$$ P(1) = P(0) - K(1)H(1)P(0) \qquad (16) $$

where

$$ H(1) = \mathrm{diag}[\mathcal{H}(1), \mathcal{H}(1)] \qquad (17) $$

$$ \mathcal{H}(1) = [1 \;\; u'(0)] \qquad (18) $$

From (13) it is clear that $J^*(1)$ is a
nonlinear function of the estimated parameter vector
$\hat{\theta}(1)$ and covariance P(1). But the estimated
vector $\hat{\theta}(1)$ and the covariance P(1) are not
known until the control u(0) is applied.

A control u(0) with a two-step horizon can be
obtained from (11) if a Taylor series expansion of
$J^*(1)$ is performed about a suitable nominal
trajectory. Here a second order expansion of
$J^*(1)$ is proposed about a nominal parameter
estimate $\bar{\theta}(1)$ and a nominal covariance $\bar{P}(1)$.

Expansion of (13) about $\bar{\theta}(1) = \hat{\theta}(0)$
and $\bar{P}(1)$ results in


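$$ J^*(1) = J^*[1, \bar{\theta}(1), \bar{P}(1)] + [J_\theta(1)]'[\hat{\theta}(1) - \hat{\theta}(0)] + \tfrac{1}{2}[\hat{\theta}(1) - \hat{\theta}(0)]' J_{\theta\theta}(1) [\hat{\theta}(1) - \hat{\theta}(0)] + \mathrm{tr}[J_P(1)\{P(1) - \bar{P}(1)\}] \qquad (19) $$

where the sensitivities defined by

$$ J_\theta(1) \triangleq \left[ \frac{\partial J^*(1)}{\partial \hat{\theta}_i(1)} \right] \qquad (20) $$

$$ J_{\theta\theta}(1) \triangleq \left[ \frac{\partial^2 J^*(1)}{\partial \hat{\theta}_i(1)\,\partial \hat{\theta}_j(1)} \right] \qquad (21) $$

$$ J_P(1) \triangleq \left[ \frac{\partial J^*(1)}{\partial P_{ij}(1)} \right] \qquad (22) $$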
are evaluated at $\hat{\theta}(1) = \hat{\theta}(0)$ and at $\bar{P}(1)$,
and $P_{ij}(1)$ is the ij-th element of the
covariance matrix associated with the parameter
estimates $\hat{\theta}_i(1)$ and $\hat{\theta}_j(1)$. With this
particular choice of $\bar{\theta}(1)$ and using (14), the
conditional expected value of (19) is

$$ E[J^*(1) \mid I^0] = J^*[1, \hat{\theta}(0), \bar{P}(1)] + \tfrac{1}{2}\mathrm{tr}[J_{\theta\theta}(1)K(1)\,E\{\nu(1)\nu'(1) \mid I^0\}\,K'(1)] + \mathrm{tr}[J_P(1)\{P(1) - \bar{P}(1)\}] \qquad (23) $$

Making use of (15), (16) and the innovation covariance,
it is clear that (23) yields

$$ E[J^*(1) \mid I^0] = J^*[1, \hat{\theta}(0), \bar{P}(1)] + \tfrac{1}{2}\mathrm{tr}[J_{\theta\theta}(1)\{P(0) - P(1)\}] + \mathrm{tr}[J_P(1)\{P(1) - \bar{P}(1)\}] \qquad (24) $$
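The step from (23) to (24) can be spelled out as follows. This is a short worked derivation using only (14)-(16); the symbol S(1) is a shorthand introduced here for the innovation covariance and does not appear in the original text.

```latex
\begin{aligned}
E\{\nu(1) \mid I^0\} &= 0
  \quad\Rightarrow\quad \text{the first order term of (19) vanishes in expectation,} \\
S(1) &\triangleq E\{\nu(1)\nu'(1) \mid I^0\} = H(1)P(0)H'(1) + W, \\
K(1)\,S(1)\,K'(1) &= P(0)H'(1)K'(1) && \text{by (15)} \\
  &= [K(1)H(1)P(0)]' = K(1)H(1)P(0) && \text{(the product is symmetric)} \\
  &= P(0) - P(1) && \text{by (16).}
\end{aligned}
```

The second order term of (23) therefore depends on the control u(0) only through the updated covariance P(1), which is what makes the reduction in parameter uncertainty appear explicitly in the expected future cost.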

The expected future cost (24) is a function of the
covariances multiplied by appropriate sensitivity
functions $J_{\theta\theta}(1)$ and $J_P(1)$. These
sensitivities introduce the dual effect into (11).
For the first order dual solution D1 of [B4] the
sensitivity $J_{\theta\theta}(1)$ is not present, and thus
the second order dual solution D2 is expected to
exploit the dual effect in the problem better. Again,
it must be noted that the covariance P(1) is
nonlinear in u(0) and is not yet known. Hence a
second order expansion of P(1) is proposed about a
nominal control $\bar{u}(0)$ and a nominal covariance
$\bar{P}(1)$ in order to obtain a (suboptimal) dual
solution $u^D(0)$ in closed form from (11). Two
choices of $\bar{u}(0)$ will be discussed later on when
the implementation of the algorithm is described.
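This expansion is performed as follows:

$$ P(1) = \bar{P}(1) + \sum_{i,j} e_i e_j' \Big\{ [P_u^{ij}(1)]'[u(0) - \bar{u}(0)] + \tfrac{1}{2}[u(0) - \bar{u}(0)]' P_{uu}^{ij}(1) [u(0) - \bar{u}(0)] \Big\} \qquad (25) $$

with superscript ij here denoting the matrix element and

$$ P_u^{ij}(1) \triangleq \frac{\partial P_{ij}(1)}{\partial u(0)} \qquad (26) $$

$$ P_{uu}^{ij}(1) \triangleq \frac{\partial^2 P_{ij}(1)}{\partial u(0)\,\partial u'(0)} \qquad (27) $$

evaluated at $\bar{P}(1)$ and $\bar{u}(0)$.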
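Now a (suboptimal) dual solution $u^D(0)$ can
be obtained from (11) using (24)-(27) and is given
by

$$ u^D(0) = -[R + E\{B'(0)QB(0) \mid I^0\} + F]^{-1}[E\{B'(0)Qc(0) \mid I^0\} + f] \qquad (28) $$

where the elements of the matrix F and those of the
vector f are given by

$$ F_{ij} = \mathrm{tr}\Big[ \tfrac{1}{2}\{J_P(1) - \tfrac{1}{2}J_{\theta\theta}(1)\} \frac{\partial^2 P(1)}{\partial u_i(0)\,\partial u_j(0)} \Big], \quad i, j = 1, \ldots, m \qquad (29) $$

and

$$ f_i = \mathrm{tr}\Big[ \tfrac{1}{2}\{J_P(1) - \tfrac{1}{2}J_{\theta\theta}(1)\} \Big\{ \frac{\partial P(1)}{\partial u_i(0)} - \sum_{j=1}^{m} \frac{\partial^2 P(1)}{\partial u_i(0)\,\partial u_j(0)}\,\bar{u}_j(0) \Big\} \Big], \quad i = 1, \ldots, m \qquad (30) $$

m being the dimension of the control vector.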
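The origin of the correction terms F and f can be traced with a short stationarity argument. The following is a sketch, assuming that the nominal term $J^*[1, \hat{\theta}(0), \bar{P}(1)]$ and the sensitivities are held fixed with respect to u(0); the shorthands M, g and G are introduced here for compactness and are not part of the original notation.

```latex
\begin{aligned}
M &\triangleq J_P(1) - \tfrac{1}{2} J_{\theta\theta}(1), \qquad
g_i \triangleq \mathrm{tr}\Big[ M \frac{\partial P(1)}{\partial u_i(0)} \Big], \qquad
G_{ij} \triangleq \mathrm{tr}\Big[ M \frac{\partial^2 P(1)}{\partial u_i(0)\,\partial u_j(0)} \Big], \\
E[J^*(1) \mid I^0] &= \text{const} + g'[u(0) - \bar{u}(0)]
  + \tfrac{1}{2}[u(0) - \bar{u}(0)]' G\, [u(0) - \bar{u}(0)]
  && \text{from (24), (25)} \\
E\{x'(1)Qx(1) \mid I^0\} &= u'(0)E\{B'QB \mid I^0\}u(0) + 2u'(0)E\{B'Qc \mid I^0\} + E\{c'Qc \mid I^0\}
  && \text{from (1)} \\
0 &= 2[R + E\{B'QB \mid I^0\}]\,u(0) + 2E\{B'Qc \mid I^0\} + g + G[u(0) - \bar{u}(0)]
  && \text{gradient of (11)} \\
u(0) &= -\big[R + E\{B'QB \mid I^0\} + \tfrac{1}{2}G\big]^{-1}
  \big[E\{B'Qc \mid I^0\} + \tfrac{1}{2}(g - G\bar{u}(0))\big].
\end{aligned}
```

Comparing the last line with the closed-form dual solution identifies $F = \tfrac{1}{2}G$ and $f = \tfrac{1}{2}(g - G\bar{u}(0))$. Setting M = 0 removes both corrections and leaves the one-step cautious solution.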
It is clear from (28) that this approximate
dual solution $u^D(0)$ is a modification of the
cautious solution by the sensitivity terms $J_P(1)$,
$J_{\theta\theta}(1)$, $P_u(1)$, $P_{uu}(1)$. These
account for the dual effect.

The implementation of this second order dual
solution (D2), (28)-(30), can be performed in three
ways:

(D2a) direct or explicit method,

(D2b) multidimensional grid search method, and

(D2c) adaptive grid search method.

These are summarized next.

Algorithm D2a

1. Choose a nominal control $\bar{u}(0)$.

2. Using this nominal control $\bar{u}(0)$, evaluate
$\bar{P}(1)$ according to (15)-(18).

3. Using $\bar{u}(0)$, $\bar{\theta}(1) = \hat{\theta}(0)$ and
$\bar{P}(1)$, compute the sensitivities required in (29),
(30) and obtain $u^D(0)$ from (28).

Algorithms D2b and D2c

1. Choose a nominal control $\bar{u}(0)$.

2. Using this nominal control $\bar{u}(0)$, evaluate
$\bar{P}(1)$ according to (15)-(18). This is the first
nominal control $\bar{u}(0)$ and covariance $\bar{P}(1)$.

3. Compute the sensitivity functions $J_{\theta\theta}(1)$,
$J_P(1)$ for (24) with $\bar{\theta}(1) = \hat{\theta}(0)$ and the
first nominal values $\bar{u}(0)$, $\bar{P}(1)$.

4. Search on (11) with (24) (with the sensitivity
functions computed above), starting with the first
nominal values $\bar{u}(0)$, $\bar{P}(1)$, over u(0) to
obtain an improved nominal $u^1(0)$ for which
$J^*(0)$ is lower than that with the first nominal
$\bar{u}(0)$. P(1) is expanded about this $u^1(0)$ in
(25). This search is a fine multidimensional grid
search in D2b. It is quite time consuming in terms
of computation and may not be justified as a practical
implementation. It is improved by the adaptive grid
search in D2c. Instead of the fine multidimensional
grid of D2b, a coarse grid is selected for D2c and an
improved nominal control is obtained. Then another
coarse grid is chosen about the latter nominal control
over a narrower interval and a refined $u^1(0)$ is
obtained. This reduces the computational burden
considerably, especially for multidimensional systems.

5. Using this $u^1(0)$ compute $P_u(1)$,
$P_{uu}(1)$; together with the previously computed
$J_{\theta\theta}(1)$, $J_P(1)$ obtain F, f from (29),
(30) and get $u^D(0)$ from (28).
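The coarse-then-narrow search of step 4 in D2c can be sketched as follows. This is only an illustration under stated assumptions: the function names, the number of grid points, the number of refinement rounds and the halving of the interval are all hypothetical choices, and `example_cost` is a stand-in quadratic, not the approximate cost (11) with (24).

```python
import itertools


def grid_search(cost, center, half_width, points_per_axis):
    """Evaluate cost on a regular grid around center and return the best point."""
    step = 2.0 * half_width / (points_per_axis - 1)
    axes = [
        [c - half_width + i * step for i in range(points_per_axis)]
        for c in center
    ]
    return min(itertools.product(*axes), key=cost)


def adaptive_grid_search(cost, center, half_width, points_per_axis=5, rounds=4):
    """Repeat a coarse grid search, re-centering and narrowing it each round."""
    best = tuple(center)
    for _ in range(rounds):
        best = grid_search(cost, best, half_width, points_per_axis)
        half_width /= 2.0  # narrower interval about the latest nominal control
    return best


def example_cost(u):
    """Hypothetical stand-in for the approximate two-step cost."""
    return (u[0] - 1.0) ** 2 + (u[1] + 1.0) ** 2 + 0.5 * u[0] * u[1]


u_nominal = adaptive_grid_search(example_cost, center=(0.0, 0.0), half_width=1.5)
print(u_nominal)
```

With 5 points per axis and 4 rounds this evaluates 100 candidate controls in two dimensions, whereas a single fine grid of comparable final resolution would need several hundred, and the gap widens quickly as the control dimension grows.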
2.5 SIMULATION RESULTS

Performance was evaluated from Monte Carlo runs 
for the following controllers: 

1) Heuristic Certainty Equivalence, 

2) One step ahead cautious controller, and 

3) Dual solution (D2) based upon sensitivity 
correction (with two-step horizon). 

This is implemented in three ways: 

(D2a) direct or explicit method, 

(D2b) multidimensional grid search method, and 
(D2c) adaptive grid search method. 

The plant equations are [M4, W2]

$$ x_1(k+1) = \theta_1 + \theta_2 u_1(k) + \theta_3 u_2(k) \qquad (31) $$

$$ x_2(k+1) = \theta_4 + \theta_5 u_1(k) + \theta_6 u_2(k) \qquad (32) $$
This model represents a simplified helicopter
vibration control problem where the first state $x_1$
is the rotor hub force amplitude and the second state
$x_2$ is the rotor blade bending moment amplitude at a
given frequency (i.e., one of the harmonics of the
rotor r.p.m.). The two controls are the higher
harmonic controls and they cancel some of the
unsteady air loads on the rotor.

The measurements are

$$ y_1(k) = x_1(k) + w_1(k) \qquad (33) $$

$$ y_2(k) = x_2(k) + w_2(k) \qquad (34) $$

where

$$ E\{w(k)w'(j)\} = W\delta_{kj} = \mathrm{diag}(W_1, W_2); \quad W_1 = 7.52^2, \; W_2 = 43^2 \qquad (35) $$

Only the gain parameters were unknown and their
initial estimates were generated as $N(\theta_i, \theta_i^2)$,
i = 2, 3, 5, 6, where the true values are

$$ \theta_1 = 23.8, \quad \theta_2 = -74.84, \quad \theta_3 = -51.04, \quad \theta_4 = -135.87, \quad \theta_5 = 53.31, \quad \theta_6 = -82.56 \qquad (36) $$

A large uncertainty is chosen in the initial parameter
estimates in order to test the learning capabilities
of the various adaptive algorithms. The cost
weighting matrices are

$$ Q = \mathrm{diag}(q_1, q_2): \; q_1 = 1.0, \; q_2 = 1.0; \qquad R = \mathrm{diag}(r_1, r_2): \; r_1 = 0, \; r_2 = 0 \qquad (37) $$

For the model chosen (31)-(36) the optimal control
solution is

$$ u_1^* = 1.0, \quad u_2^* = -1.0 \qquad (38) $$
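The optimal control (38) can be checked directly against the true parameters (36). The following is a small sketch, assuming known parameters and R = 0, so that the best achievable control drives both noise-free vibration outputs of (31)-(32) to zero; the function names are hypothetical.

```python
# True parameter values from (36).
theta = {1: 23.8, 2: -74.84, 3: -51.04, 4: -135.87, 5: 53.31, 6: -82.56}


def plant(u1, u2):
    """Noise-free plant (31)-(32): returns (x1, x2) for controls (u1, u2)."""
    x1 = theta[1] + theta[2] * u1 + theta[3] * u2
    x2 = theta[4] + theta[5] * u1 + theta[6] * u2
    return x1, x2


def zero_vibration_control():
    """Solve B u = -c for the 2x2 gain matrix B by Cramer's rule."""
    det = theta[2] * theta[6] - theta[3] * theta[5]
    u1 = (-theta[1] * theta[6] + theta[3] * theta[4]) / det
    u2 = (-theta[2] * theta[4] + theta[5] * theta[1]) / det
    return u1, u2


u1_opt, u2_opt = zero_vibration_control()
x1_opt, x2_opt = plant(u1_opt, u2_opt)
print(u1_opt, u2_opt, x1_opt, x2_opt)
```

Up to floating-point rounding the solution is u1 = 1.0 and u2 = -1.0, and both vibration amplitudes vanish, so any nonzero average cost in the simulations is due to parameter uncertainty and measurement noise alone.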
In terms of the notation of Section 2,

$$ c = [\theta_1 \;\; \theta_4]' \qquad (39) $$

$$ B = \begin{bmatrix} \theta_2 & \theta_3 \\ \theta_5 & \theta_6 \end{bmatrix} \qquad (40) $$

$$ u(k) = [u_1(k) \;\; u_2(k)]' \qquad (41) $$

The controllers are implemented with a sliding
horizon for a total of 20 time steps. The
evaluation criterion is

$$ C_k = q_1 x_1^2(k) + q_2 x_2^2(k) \qquad (42) $$


Analysis of the Monte Carlo Average Costs 


Comparisons are made between the performances
of the cautious and the various dual algorithms
(D2a-D2c) on the system, and a conventional statistical
significance analysis is done using the normal theory
approach (i.e., it is assumed that the central limit
theorem holds for the sample mean from a large number
of runs). This is given in Appendix A. Tables 1-6
contain the results of the simulation runs. Table 1
compares the average cost $C_k$ over 100 Monte
Carlo runs for the first 10 time steps for the HCE,
cautious and dual algorithms, with an active
control limiter $|u_i| \le 1.5$, i = 1, 2.

The cumulative average cost is clearly the lowest
for the dual controller. The HCE
increases the vibrations in time step 1 by using too
large control magnitudes because of lack of caution.
This, however, helps to learn the parameters faster and
reduces the vibration earlier than the others. The
dual controller sometimes demands large control
magnitudes but less often than the HCE. In a
realistic situation large control magnitudes are not
permitted because of the active control limiters
discussed above. Tables 2-4 provide a statistical
significance test for the run with the limiter and
show that the dual solutions improve upon the cautious
solution on the average by 60% with at least 95%
confidence. Table 5 shows the percentile test
comparing the cautious and the dual (D2c) solutions
(Appendix A). It clearly indicates that from time
step 3 onwards the tail of the dual is lighter than
that of the cautious solution. This test was carried
out for 500 Monte Carlo runs. Also a sample
distribution function plot was made for the vibration
cost at each time step comparing the two algorithms;
Figures 6, 7 are typical examples. From the plots a
threshold value of 5000 was chosen for the cost, and
Table 6 indicates the percentage of runs in which the
vibration cost exceeds 5000 for the two algorithms.
This also indicates the light-tailed nature of the
distribution obtained by the dual algorithm.
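The kind of comparison reported in the tables can be sketched in a few lines. Appendix A is not reproduced here, so the following is only an assumed form of the analysis: a normal-theory statistic on paired cost differences (pairing the runs is an assumption), a percentage improvement of the sample means, and the fraction of runs above a threshold. The function names and the four-run data set are hypothetical and are not simulation results from this work.

```python
import math


def z_statistic(cost_a, cost_b):
    """Normal-theory test statistic for the mean of paired cost differences."""
    diffs = [a - b for a, b in zip(cost_a, cost_b)]
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)
    return mean / math.sqrt(var / n)


def estimated_improvement(cost_a, cost_b):
    """Percentage reduction of the average cost of b relative to a."""
    mean_a = sum(cost_a) / len(cost_a)
    mean_b = sum(cost_b) / len(cost_b)
    return 100.0 * (mean_a - mean_b) / mean_a


def percent_exceeding(costs, threshold):
    """Percentage of runs whose cost exceeds the threshold."""
    return 100.0 * sum(1 for c in costs if c > threshold) / len(costs)


# Hypothetical per-run costs at one time step (illustration only).
cautious = [6000.0, 5200.0, 4800.0, 7000.0]
dual = [2500.0, 2000.0, 2100.0, 2600.0]

z = z_statistic(cautious, dual)
improvement = estimated_improvement(cautious, dual)
tail_cautious = percent_exceeding(cautious, 5000.0)
tail_dual = percent_exceeding(dual, 5000.0)
print(z, improvement, tail_cautious, tail_dual)
```

Under the normal approximation, a statistic above 1.645 corresponds to a one-sided significance level of 5%, which is the sense in which an improvement can be claimed with at least 95% confidence.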

Individual Time History Runs 

Analysis of the Monte Carlo average cost
indicates the improvement offered by the dual
solution but provides no information about the
cautious controls turning off. Hence a careful
investigation of the individual runs is required to
discover these occurrences. The turn-off phenomenon is
observed in many of the 100 Monte Carlo runs
when using the cautious controller; runs 11, 60, 94 and
98 are typical examples. In run 11, control 2
turns off between time steps 0 and 16. In run 60,
control 2 turns off between time steps 0 and
8. In run 94, control 2 also turns off, between
time steps 0 and 6. The worst case of turn-off occurs
in run 98, where both controls are off between time
steps 0 and 12. Here at time step 13 another
interesting phenomenon called burst occurs. The
cautious control exceeds the limits and this is reflected
in a small hump in the cost curve at time step 13. In
all these cases the dual does better and avoids the
turn-off and the burst phenomena. As
explained in Section 3, the control for a constant
parameter plant recovers from the turn-off situation
through the burst phenomenon. A large measurement noise helps
the plant to come back to life by causing a
burst. Run 89 (Figure 5) is a case where the
cautious controller exercises an excess of caution and is
slow in convergence. This is avoided by the dual
solution. These cases are portrayed in Figures 3-7.
For the latter case, the dual controller moves in the right
direction of control by utilizing the dual effect from
the very outset. Analysis of the simulation runs has
shown that this new dual control solution applied to a
multivariable input-output model improves the cost on
the average by 60%. The key improvement is in
avoiding situations like turn-off, burst and slow
convergence, typical of the cautious solution.
Table 1: Average costs for the three algorithms in the
static model with a limiter (100 Monte Carlo
Runs; $|u_1| \le 1.5$; $|u_2| \le 1.5$)

Table 2: Statistical significance test for
comparisons of cautious and the dual
algorithms (D2a) in the static model with
a limiter ($|u_1| \le 1.5$, $|u_2| \le 1.5$)
(100 Monte Carlo Runs).

Time Step    Test Statistic    Estimated Improvement
    k             z_k                EI_k (%)
    1             .29                  .9
    2            2.2                 17.
    3            4.4                 51.
    4            2.0                 40.
    5            1.7                 44.
    6            2.7                 63.
    7            2.8                 71.
    8            2.7                 84.
    9            2.4                 61.
   10            2.2                 60.


Statistical significance test for
comparisons of cautious and the dual
algorithms (D2c) in the static model with
a limiter ($|u_1| \le 1.5$, $|u_2| \le 1.5$)
(100 Monte Carlo Runs).

Table 5: Percentile test for comparisons of cautious
and the dual algorithms (D2c) in the static
model with a limiter (500 Monte Carlo Runs;
$|u_1| \le 1.5$, $|u_2| \le 1.5$)

Time Step    $\chi^2$ test statistic at $K_{90}$
    1             1.6
    2             0.4
    3            30
    4            48
    5            33
    6            25
    7            27
    8            27
    9            23
   10            19
Table 6: Comparison of the tails using the cautious
and the dual algorithm (D2c) in the static
model with a limiter (500 Monte Carlo Runs;
$|u_1| \le 1.5$, $|u_2| \le 1.5$)

          Percentage of runs the vibration exceeds 5000
Time k       Cautious       Dual
   1           94            86
   2           41            29
   3           23             8
   4           16             2.4
   5           11             1.2
   6            9             0.8
   7            7.4           0.8
   8            6             0.6
   9            6             0.6
  10            5.2           0.4



Fig. 1 Sample distribution of vibration cost at time step 4
using cautious and dual (D2c) algorithms (500 Monte
Carlo Runs; $|u_1| \le 1.5$; $|u_2| \le 1.5$).
Axes: number of samples versus cost value x 1000.

Fig. 2 Sample distribution of vibration cost at time step 5
using cautious and dual (D2c) algorithms (500 Monte
Carlo Runs; $|u_1| \le 1.5$; $|u_2| \le 1.5$).
Axes: number of samples versus cost value x 1000.

Fig. 3a Time history of cost and controls using the
cautious and the dual algorithms for run 11
(100 Monte Carlo Runs; $|u_1| \le 1.5$; $|u_2| \le 1.5$)
(see pages 44, 45). Axes: cost x 1000 versus time step.

Fig. 4a Time history of cost and controls using the
cautious and the dual algorithms for run 60
(100 Monte Carlo Runs; $|u_1| \le 1.5$; $|u_2| \le 1.5$)
(see pages 47, 48). Axes: cost x 1000 versus time step.

Fig. 5a Time history of cost and controls using the
cautious and the dual algorithms for run 89
(100 Monte Carlo Runs; $|u_1| \le 1.5$; $|u_2| \le 1.5$)
(see pages 50, 51). Axes: cost x 1000 versus time step.

Fig. 7a Time history of cost and controls using the
cautious and the dual algorithms for run 98
(100 Monte Carlo Runs; $|u_1| \le 1.5$; $|u_2| \le 1.5$)
(see pages 56, 57). Axes: cost x 1000 versus time step.

Fig. 7c Control 2 for run 98 (see pages 55, 56).
Axes: control 2 versus time step.
2.6 CONCLUSIONS

A new adaptive dual control solution is applied
here to a multivariable input-output system. This
solution captures the dual effect by performing a
second order Taylor series expansion of the expected
future cost. It modifies the cautious solution by
numerator and denominator correction terms. It also
avoids problems of turn-off, burst and slow
convergence, typical of the cautious solution.



Chapter 3 


An Adaptive Dual Controller for
a Dynamic MIMO System

3.1 INTRODUCTION 

In this chapter a second order Taylor series
expansion of the future expected cost is performed
about a nominal trajectory and a dual controller is
developed for a MIMO dynamic (ARMA of lag one)
model. The cautious [W1, S1, M3] and the new dual
controller are applied to a MIMO ARMA system. Monte
Carlo simulations, based on parametric and
nonparametric statistical analysis, indicate that the
dual controller prevents the turn-off phenomenon and
slow convergence prevalent with a cautious solution.

Section 2 gives the problem formulation. The
approximate dual controller with a two-step horizon
for the MIMO system is derived in Section 3. The
control solution is obtained by approximating the
solution of the stochastic dynamic programming
equation. A second order Taylor series expansion of
the expected future cost is performed about a nominal
trajectory and this leads to a dual control solution
in closed form. Following the derivation of the
controller, a summary of the algorithm is given.
Section 4 describes the simulation of the plant and
compares the performances of the cautious and the dual
solutions. Section 5 concludes the chapter.
3.2 PROBLEM FORMULATION

The MIMO system to be controlled is described by

$$y(k) = -A\,y(k-1) + B\,u(k-1) + e(k) \qquad (43)$$

where

$$E[e(k)] = 0; \qquad E[e(k)\,e'(j)] = W\,\delta_{kj} \qquad (44)$$


The parameter matrices $A$ and $B$ are unknown.
This model describes some industrial processes like an
ore crushing plant, or a heat exchanger [A2]. The
unknown elements of $A$ and $B$ comprise the parameter
vector $\theta(k)$, whose estimate at time $k$ is
$\hat\theta(k)$ with covariance matrix $P(k)$. The
parameter vector is designated as

$$\theta(k) \triangleq [\,a_1' \mid b_1' \mid a_2' \mid b_2' \mid \cdots \mid a_n' \mid b_n'\,]' \qquad (45)$$

where $n$ is the dimension of the output vector $y(k)$
and $a_i'$, $b_i'$ are the $i$-th rows of the matrices
$A$ and $B$, respectively. Assuming the parameters are
time-invariant we have

$$\theta(k+1) = \theta(k) \qquad (46)$$

A measurement matrix $H(k)$ is defined as

$$H(k) \triangleq \mathrm{diag}\,[\,-y'(k) \mid u'(k),\ \ldots,\ -y'(k) \mid u'(k)\,] \qquad (47)$$

where $H(k)$ has $n$ rows. For a better understanding
of the form of this matrix please refer to (81).

With these definitions the measurement model is

$$y(k) = H(k-1)\,\theta(k-1) + e(k) \qquad (48)$$
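As a check on the definitions (45), (47) and (48), the sketch below builds $H(k)$ and the stacked parameter vector and confirms that the noise-free part of (48) reproduces the noise-free part of (43). The helper names (`build_H`, `stack_theta`, `matvec`) are illustrative only, and the numerical values are borrowed from the simulation example (76), (78), (79).

```python
def build_H(y, u):
    """Measurement matrix H(k) of (47): n rows, each carrying the block
    [-y'(k) | u'(k)] in its own column range, zeros elsewhere."""
    row_block = [-v for v in y] + list(u)
    n, width = len(y), len(y) + len(u)
    return [[0.0] * (i * width) + row_block + [0.0] * ((n - 1 - i) * width)
            for i in range(n)]


def stack_theta(A, B):
    """Parameter vector of (45): [a_1' | b_1' | a_2' | b_2' | ...]'."""
    theta = []
    for a_row, b_row in zip(A, B):
        theta += list(a_row) + list(b_row)
    return theta


def matvec(M, v):
    """Matrix-vector product for nested lists."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


# Values taken from (76); y_prev and u_prev are illustrative choices.
A = [[0.8, 0.1], [0.2, 0.75]]
B = [[-74.84, -51.04], [53.31, -82.56]]
y_prev, u_prev = [-18.0, 80.0], [1.0, -1.0]

H = build_H(y_prev, u_prev)
# Noise-free part of (48): H(k-1) theta
y_from_H = matvec(H, stack_theta(A, B))
# Noise-free part of (43): -A y(k-1) + B u(k-1)
y_from_plant = [bu - ay for ay, bu in zip(matvec(A, y_prev), matvec(B, u_prev))]
```

Both routes give the same predicted output, which is the point of writing the plant in the linear-in-parameters form (48).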
The performance criterion to be minimized is the
expected value of the cost from step 0 to $N$,

$$J(0) = E\{C(0)\} = E\Big[\sum_{k=0}^{N-1} (y(k+1) - y_r)'\,Q(k)\,(y(k+1) - y_r) \,\Big|\, I^k\Big] \qquad (49)$$

where $Q(k)$ is diagonal and $I^k$ is the cumulated
information at time $k$.
3.3 DUAL CONTROL WITH A TWO-STEP HORIZON

First the controller is derived and then a summary
of the algorithm is provided.

A dual control solution with a two-step horizon is
obtained by minimizing (49) with respect to the
control $u(0)$ for the multidimensional plant
(43)-(46). This is obtained by solving the general
equation of Stochastic Dynamic Programming [B2, B6,
B7]

$$J^*(k) = \min_{u(k)} E\{C(k) + J^*(k+1) \mid I^k\}, \qquad k = N-1, \ldots, 1, 0 \qquad (50)$$

where $J^*(k)$ is the cost to go from $k$ to $N$ and
$I^k$ is the cumulated information at time $k$ when
the control $u(k)$ is to be applied.
Thus for a two-step horizon we have

$$\begin{aligned} J^*_{k,k+2} &= \min_{u(k)} E\{C(k) + J^*_{k+1,k+2} \mid I^k\} \\ &= \min_{u(k)} E\big[\{y(k+1) - y_r\}'\,Q(k)\,\{y(k+1) - y_r\} + J^*_{k+1,k+2} \mid I^k\big] \end{aligned} \qquad (51)$$

where $J^*_{k+1,k+2}$ is the optimal cost at the
last step and is obtained by minimization of
$J_{k+1,k+2}$.

The cautious control with a one step sliding
horizon at $k+1$ is given by

$$u(k+1) = [E\{B'Q(k+1)B \mid I^{k+1}\}]^{-1}\, E[B'Q(k+1)\{A\,y(k+1) + y_r\} \mid I^{k+1}] \qquad (52)$$

This helps us in obtaining the optimal cost to go
at the penultimate stage.
The cost from step $k+1$ to $k+2$ is

$$\begin{aligned} J_{k+1,k+2} ={}& \mathrm{tr}\,Q(k+1)W + E\big[\{A y(k+1) + y_r\}'Q(k+1)\{A y(k+1) + y_r\} \\ &+ u'(k+1)B'Q(k+1)B\,u(k+1) - 2\{A y(k+1) + y_r\}'Q(k+1)B\,u(k+1) \mid I^{k+1}\big] \end{aligned} \qquad (53)$$

and inserting (52) into (53) the optimal cost at the
last step is

$$\begin{aligned} J^*_{k+1,k+2} ={}& \mathrm{tr}\,Q(k+1)W + E[\{A y(k+1) + y_r\}'Q(k+1)\{A y(k+1) + y_r\} \mid I^{k+1}] \\ &- E[\{A y(k+1) + y_r\}'Q(k+1)B \mid I^{k+1}]\,(E\{B'Q(k+1)B \mid I^{k+1}\})^{-1} \\ &\quad\times E[B'Q(k+1)\{A y(k+1) + y_r\} \mid I^{k+1}] \end{aligned} \qquad (54)$$

where $E\{\,\cdot \mid I^{k+1}\}$ is the conditional
expectation given the available information $I^{k+1}$.

The parameter vector estimate $\hat\theta(k+1)$ and the
associated covariance matrix $P(k+1)$ are obtained
from a Kalman filter according to

$$K(k+1) = P(k)H'(k)[H(k)P(k)H'(k) + W]^{-1} \qquad (55)$$

$$\hat\theta(k+1) = \hat\theta(k) + K(k+1)[y(k+1) - H(k)\hat\theta(k)] = \hat\theta(k) + K(k+1)\,\nu(k+1) \qquad (56)$$

$$P(k+1) = P(k) - P(k)H'(k)[H(k)P(k)H'(k) + W]^{-1}H(k)P(k) \qquad (57)$$

From (54) it is clear that $J^*_{k+1,k+2}$ is
a nonlinear function of the estimated parameter
vector $\hat\theta(k+1)$ and covariance $P(k+1)$. But the
estimated vector $\hat\theta(k+1)$ and the covariance
$P(k+1)$ are not known until the control $u(k)$ is
applied.
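The dependence of $P(k+1)$ on the applied control can be seen directly from (55)-(57). The following is a minimal sketch for a 2-output plant in which only the four gains of $B$ are unknown, so that each row of $H(k)$ contains just $u'(k)$. The prior covariance of 100 on each gain, the zero measurement and the helper names are assumptions made for illustration; $W$ uses the values from (75).

```python
def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def transpose(X):
    return [list(col) for col in zip(*X)]


def inv2(M):
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [[M[1][1] / det, -M[0][1] / det], [-M[1][0] / det, M[0][0] / det]]


def kalman_update(theta_hat, P, H, W, y_next):
    """One step of (55)-(57) for a plant with two outputs."""
    PHt = matmul(P, transpose(H))
    S = matmul(H, PHt)
    S = [[S[i][j] + W[i][j] for j in range(2)] for i in range(2)]  # H P H' + W
    K = matmul(PHt, inv2(S))                                      # (55)
    pred = [sum(h * t for h, t in zip(row, theta_hat)) for row in H]
    nu = [y_next[i] - pred[i] for i in range(2)]                   # innovation
    theta_new = [t + sum(K[r][i] * nu[i] for i in range(2))
                 for r, t in enumerate(theta_hat)]                 # (56)
    KHP = matmul(K, matmul(H, P))
    P_new = [[P[r][c] - KHP[r][c] for c in range(len(P))]
             for r in range(len(P))]                               # (57)
    return theta_new, P_new, K


def gain_H(u):
    """H(k) when only the gains in B are unknown: block rows of u'(k)."""
    return [[u[0], u[1], 0.0, 0.0], [0.0, 0.0, u[0], u[1]]]


def trace(M):
    return sum(M[i][i] for i in range(len(M)))


W = [[7.52 ** 2, 0.0], [0.0, 43.0 ** 2]]
P0 = [[100.0 if r == c else 0.0 for c in range(4)] for r in range(4)]
theta0 = [0.0] * 4
_, P_probe, _ = kalman_update(theta0, P0, gain_H([1.0, -1.0]), W, [0.0, 0.0])
_, P_off, _ = kalman_update(theta0, P0, gain_H([0.0, 0.0]), W, [0.0, 0.0])
```

With a zero control the covariance is unchanged (no learning takes place), while a nonzero control reduces its trace. This is why the choice of $u(k)$ affects future estimation quality.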
A control $u(k)$ with a two-step horizon can be
obtained from (51) if a second order Taylor series
expansion of $J^*_{k+1,k+2}$ is performed about a
suitable nominal trajectory. Here the nominal
trajectory is defined by

1) a nominal parameter estimate $\bar\theta(k+1) = \hat\theta(k)$

2) a nominal control $\bar u(k)$

3) a nominal covariance $\bar P(k+1)$ obtained by
using $\bar u(k)$

and

4) a nominal measurement $\bar y(k+1)$ obtained by
using $\bar u(k)$ and $\hat\theta(k)$.
Expansion of (54) about this nominal trajectory
results in

$$\begin{aligned} J^*_{k+1,k+2} ={}& J_0 + J_y'(k+1)[y(k+1) - \bar y(k+1)] \\ &+ \tfrac{1}{2}[y(k+1) - \bar y(k+1)]'J_{yy}(k+1)[y(k+1) - \bar y(k+1)] \\ &+ J_\theta'(k+1)[\hat\theta(k+1) - \hat\theta(k)] + \mathrm{tr}[J_P(k+1)\{P(k+1) - \bar P(k+1)\}] \\ &+ \tfrac{1}{2}[\hat\theta(k+1) - \hat\theta(k)]'J_{\theta\theta}(k+1)[\hat\theta(k+1) - \hat\theta(k)] \end{aligned} \qquad (58)$$

where $J_0$ is the zeroth order term and the cost
sensitivities are

$$J_y(k+1) \triangleq \left[\frac{\partial J^*_{k+1,k+2}}{\partial y_i(k+1)}\right] \qquad (59)$$

$$J_{yy}(k+1) \triangleq \left[\frac{\partial^2 J^*_{k+1,k+2}}{\partial y_i(k+1)\,\partial y_j(k+1)}\right] \qquad (60)$$

$$J_\theta(k+1) \triangleq \left[\frac{\partial J^*_{k+1,k+2}}{\partial \hat\theta_i(k+1)}\right] \qquad (61)$$

$$J_{\theta\theta}(k+1) \triangleq \left[\frac{\partial^2 J^*_{k+1,k+2}}{\partial \hat\theta_i(k+1)\,\partial \hat\theta_j(k+1)}\right] \qquad (62)$$

$$J_P(k+1) \triangleq \left[\frac{\partial J^*_{k+1,k+2}}{\partial P_{ij}(k+1)}\right] \qquad (63)$$

The above sensitivities are evaluated at
$\hat\theta(k)$, $\bar P(k+1)$ and $\bar y(k+1)$; and
$P_{ij}(k+1)$ is the $ij$-th element of the covariance
matrix associated with the parameter estimates
$\hat\theta_i(k+1)$ and $\hat\theta_j(k+1)$.

Under Gaussian assumption for the noise

$$y(k+1) - \bar y(k+1) \sim \mathcal{N}[\mu, V] \qquad (64)$$
where the mean is

$$\mu = E\{H(k)\theta(k) + e(k+1) - \bar H(k)\hat\theta(k) \mid I^k\} = [H(k) - \bar H(k)]\,\hat\theta(k) \qquad (65)$$

with $\bar H(k)$ denoting $H(k)$ evaluated at the nominal
control $\bar u(k)$, and the covariance is

$$V = E[\{y(k+1) - \bar y(k+1) - \mu\}\{y(k+1) - \bar y(k+1) - \mu\}' \mid I^k] = H(k)P(k)H'(k) + W \qquad (66)$$

With the choice of the nominal path as defined
earlier and using (55), (65) and (66) the conditional
expected value of (54) is

$$\begin{aligned} E\{J^*_{k+1,k+2} \mid I^k\} ={}& J_0 + J_y'(k+1)[H(k) - \bar H(k)]\,\hat\theta(k) + \tfrac{1}{2}\mu'J_{yy}(k+1)\mu \\ &+ \tfrac{1}{2}\mathrm{tr}[J_{yy}(k+1)V] + \tfrac{1}{2}\mathrm{tr}[J_{\theta\theta}(k+1)\{P(k) - P(k+1)\}] \\ &+ \mathrm{tr}[J_P(k+1)\{P(k+1) - \bar P(k+1)\}] \end{aligned} \qquad (67)$$
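The absence of a $J_\theta$ term in (67), and the appearance of $P(k) - P(k+1)$ in the $J_{\theta\theta}$ term, follow from the Kalman filter equations (55)-(57). A short supporting derivation is given here, under the usual assumption that, given $I^k$, the innovation $\nu(k+1)$ is zero-mean with covariance $H(k)P(k)H'(k) + W$:

$$\begin{aligned}
\hat\theta(k+1) - \hat\theta(k) &= K(k+1)\,\nu(k+1), \qquad E\{\nu(k+1) \mid I^k\} = 0 \\
\Rightarrow\quad E\{J_\theta'(k+1)[\hat\theta(k+1) - \hat\theta(k)] \mid I^k\} &= 0 \\
E\{[\hat\theta(k+1) - \hat\theta(k)][\hat\theta(k+1) - \hat\theta(k)]' \mid I^k\}
&= K(k+1)[H(k)P(k)H'(k) + W]K'(k+1) \\
&= P(k)H'(k)[H(k)P(k)H'(k) + W]^{-1}H(k)P(k) \\
&= P(k) - P(k+1)
\end{aligned}$$

Taking the expectation of the quadratic form in $\hat\theta(k+1) - \hat\theta(k)$ and using $E\{x'Mx\} = \mathrm{tr}[M\,E\{xx'\}]$ then gives the term $\tfrac{1}{2}\mathrm{tr}[J_{\theta\theta}(k+1)\{P(k) - P(k+1)\}]$.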

The above expected future cost (67) is a function
of the nominal parameters multiplied by appropriate
sensitivity functions $J_y(k+1)$, $J_{yy}(k+1)$,
$J_{\theta\theta}(k+1)$ and $J_P(k+1)$. These
sensitivities introduce the dual effect into (51),
which is then used to yield $u(k)$. It must also be
noted that the covariance $P(k+1)$ is nonlinear in
$u(k)$ and is not yet known. Hence a second order
expansion of $P(k+1)$ is proposed about a nominal
control $\bar u(k)$ and a nominal covariance $\bar P(k+1)$
in order to obtain a (suboptimal) dual solution
$u^D(k)$ in closed form from (51).

This expansion is performed as follows

$$P(k+1) = \bar P(k+1) + \sum_{i,j} e_i e_j'\Big\{P_u^{ij\,\prime}(k+1)[u(k) - \bar u(k)] + \tfrac{1}{2}[u(k) - \bar u(k)]'P_{uu}^{ij}(k+1)[u(k) - \bar u(k)]\Big\} \qquad (68)$$

with superscript here denoting matrix element, $e_i$
the $i$-th cartesian basis vector and

$$P_u^{ij}(k+1) \triangleq \frac{\partial P^{ij}(k+1)}{\partial u(k)}, \qquad i, j = 1, \ldots, l \qquad (69)$$

(and $P_{uu}^{ij}(k+1)$ the corresponding matrix of second
derivatives) evaluated at $\bar P(k+1)$ and $\bar u(k)$,
and $l$ the number of unknown parameters.
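To illustrate what the covariance sensitivities in (68) and (69) represent, the sketch below treats the simplest possible case: a scalar plant $y(k+1) = b\,u(k) + e(k+1)$ with one unknown gain, so that $H(k) = u(k)$ and (57) is a scalar function of $u$. The derivatives are estimated by central differences. This is only an assumed numerical stand-in for the analytical sensitivities; the function names and the values of $P$, $W$ and $\bar u$ are hypothetical.

```python
def p_next(u, P, W):
    """Posterior variance from (57) for the scalar plant, where H(k) = u(k)."""
    return P - P * u * (1.0 / (u * P * u + W)) * u * P


def finite_diff_sensitivities(u_bar, P, W, h=1e-4):
    """Central-difference estimates of P_u and P_uu at the nominal control."""
    p_plus = p_next(u_bar + h, P, W)
    p_zero = p_next(u_bar, P, W)
    p_minus = p_next(u_bar - h, P, W)
    P_u = (p_plus - p_minus) / (2.0 * h)
    P_uu = (p_plus - 2.0 * p_zero + p_minus) / (h * h)
    return P_u, P_uu


def second_order_p(u, u_bar, P, W):
    """Scalar version of the expansion (68) about the nominal control u_bar."""
    P_u, P_uu = finite_diff_sensitivities(u_bar, P, W)
    du = u - u_bar
    return p_next(u_bar, P, W) + P_u * du + 0.5 * P_uu * du * du


P0, W0, u_bar = 4.0, 1.0, 0.5
P_u, P_uu = finite_diff_sensitivities(u_bar, P0, W0)
# Closed-form first derivative of P W / (u^2 P + W) for comparison.
P_u_exact = -2.0 * u_bar * P0 ** 2 * W0 / (u_bar ** 2 * P0 + W0) ** 2
```

The negative first derivative shows that, for positive $\bar u$, a larger control reduces the posterior variance. The second order expansion stays close to the exact update for controls near the nominal value.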
Now a (suboptimal) dual solution $u^D(k)$ can be
obtained from (51) using (67)-(69) and is given in
closed form by

$$u^D(k) = [E\{B'Q(k)B \mid I^k\} + F]^{-1}\,(E\{B'Q(k)(A\,y(k) + y_r) \mid I^k\} + f) \qquad (70)$$

where the elements of the matrix $F$ and those of the
vector $f$ are given by

[Eq. (71): expression for $F_{ij}$, $i, j = 1, \ldots, m$; illegible in the scanned source]

and

[Eq. (72): expression for $f_i$, $i = 1, \ldots, m$; illegible in the scanned source]

and $m$ is the dimension of the control vector.


It is clear from (70) that this approximate dual
solution $u^D(k)$ is a modification of the cautious
solution by the cost sensitivity terms $F$ and $f$,
which account for the dual effect. The cautious
solution is (70) with $F = 0$ and $f = 0$. The
implementation of this second order dual solution is
performed by the method described below.

Algorithm Summary

1) Compute the sensitivity functions
$J_{\theta\theta}(k+1)$, $J_P(k+1)$, $J_y(k+1)$,
$J_{yy}(k+1)$ for (67) with
$\bar\theta(k+1) = \hat\theta(k)$ and the nominal values
$\bar u(k)$, $\bar P(k+1)$, $\bar y(k+1)$ defining the
nominal path.

2) Search on (51) with (67) [with the sensitivity
functions computed above, starting with first
nominal values $\hat\theta(k)$, $\bar P(k+1)$] over
$u(k)$ to obtain an improved nominal for which
$J^*_{k,k+2}$ is lower. This search is done by
selecting a first coarse grid. A grid search is
necessary to avoid locking in on a local minimum.
Then another grid is chosen about the latter
control over a narrower interval and from a second
search $u^1(k)$ is obtained. This is the control
about which the covariances are expanded. It is
not the control law applied.

3) Using $u^1(k)$ compute the covariance
sensitivities $P_u(k+1)$, $P_{uu}(k+1)$;
together with the previously computed cost
sensitivities $J_{\theta\theta}(k+1)$, $J_P(k+1)$,
$J_{yy}(k+1)$, $J_y(k+1)$ obtain $F$, $f$
defined in (71), (72). Finally the control to be
applied, $u^D(k)$, is calculated from its
expression (70) in terms of the various
expectations and sensitivities.

The iteration described in step 2 above is carried
out to obtain better covariance sensitivities. The
control $u^D(k)$ could have been obtained directly
from (70) by skipping step 2 above; however, as
indicated in [M2, M3] this results in unsatisfactory
performance. With this iteration of step 2, the
"improved" sensitivities yield good performance as
shown in the next section.
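The two-pass search of step 2 can be sketched as follows for a two-dimensional control box $|u_i| \le 2$. The grid sizes, the width of the second grid and the cost function are assumptions made for illustration; in particular `toy_cost` is a hypothetical stand-in with one global and one local minimum, not the cost (51) with (67).

```python
import itertools


def grid_points(center, half_width, n, limit):
    """n equally spaced values about center, clipped to the limiter |u| <= limit."""
    step = 2.0 * half_width / (n - 1)
    return [max(-limit, min(limit, center - half_width + i * step))
            for i in range(n)]


def coarse_to_fine_search(cost, limit=2.0, n_coarse=9, n_fine=11,
                          fine_half_width=0.5):
    """Two-pass grid search over the control box |u_i| <= limit."""
    # Pass 1: coarse grid over the whole admissible box.
    axis = grid_points(0.0, limit, n_coarse, limit)
    best = min(itertools.product(axis, axis), key=cost)
    # Pass 2: finer grid over a narrower interval about the coarse minimizer.
    axes = [grid_points(c, fine_half_width, n_fine, limit) for c in best]
    return min(itertools.product(*axes), key=cost)


def toy_cost(u):
    """Hypothetical cost: global minimum at (1, -1), local one at (-1.5, 1.5)."""
    global_bowl = (u[0] - 1.0) ** 2 + (u[1] + 1.0) ** 2
    local_bowl = 0.5 + (u[0] + 1.5) ** 2 + (u[1] - 1.5) ** 2
    return min(global_bowl, local_bowl)


u_nominal = coarse_to_fine_search(toy_cost)
```

Because the first pass samples the whole box, the search is not trapped by the local minimum, which a gradient method started near $(-1.5, 1.5)$ could be.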
3.4 SIMULATION RESULTS

Performance is evaluated from 500 Monte Carlo runs
for the following controllers:

1) Heuristic Certainty Equivalence (HCE) [B2] (with a
one step horizon),

2) One-step-ahead cautious controller, and

3) Dual controller based upon sensitivity
functions (with a two-step horizon) derived
in Sec. 3.

The plant equations for a 2-input 2-output system
are

$$y_1(k+1) = -a_{11}y_1(k) - a_{12}y_2(k) + b_{11}u_1(k) + b_{12}u_2(k) + e_1(k+1) \qquad (73)$$

$$y_2(k+1) = -a_{21}y_1(k) - a_{22}y_2(k) + b_{21}u_1(k) + b_{22}u_2(k) + e_2(k+1) \qquad (74)$$

where

$$E\{e(k)e'(j)\} = W\delta_{kj} = \mathrm{diag}(W_1, W_2); \qquad W_1 = 7.52^2, \quad W_2 = 43^2 \qquad (75)$$


The true values of the parameters are

$$\begin{aligned} a_{11} &= .8 & b_{11} &= -74.84 \\ a_{12} &= .1 & b_{12} &= -51.04 \\ a_{21} &= .2 & b_{21} &= 53.31 \\ a_{22} &= .75 & b_{22} &= -82.56 \end{aligned} \qquad (76)$$


Only the gain parameters ($B$ matrix) are considered
unknown for testing the dual effect, and their initial
estimates were generated as $\mathcal{N}(\,\cdot\,,\ b_{ij})$,
$i, j = 1, 2$. This choice of system was
motivated by the helicopter vibration study [M2].
A large initial uncertainty is chosen in the
parameter estimates in order to test the learning
capabilities of the various adaptive algorithms. The
cost weighting matrices are

$$Q(k) = \mathrm{diag}(q_1, q_2); \qquad q_1 = 1.0, \quad q_2 = 1.0 \qquad (77)$$

The desired response is

$$y_r = [-18 \quad 80]' \qquad (78)$$

For the model chosen (73)-(78) the optimal control
solution is

$$u_1 = 1.0, \qquad u_2 = -1.0 \qquad (79)$$
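One way to see where (79) comes from, offered here as a plausibility check rather than as the author's derivation, is to ask which constant control holds the noise-free plant at the desired response. Setting $y(k+1) = y(k) = y_r$ in (73)-(74) gives $B u = (I + A) y_r$, which the sketch below solves with the values from (76) and (78).

```python
A = [[0.8, 0.1], [0.2, 0.75]]
B = [[-74.84, -51.04], [53.31, -82.56]]
y_r = [-18.0, 80.0]

# Holding y(k+1) = y(k) = y_r in the noise-free plant requires B u = (I + A) y_r.
rhs = [y_r[i] + sum(A[i][j] * y_r[j] for j in range(2)) for i in range(2)]

# Solve the 2x2 system by Cramer's rule.
det = B[0][0] * B[1][1] - B[0][1] * B[1][0]
u1 = (rhs[0] * B[1][1] - B[0][1] * rhs[1]) / det
u2 = (B[0][0] * rhs[1] - rhs[0] * B[1][0]) / det
print(round(u1, 3), round(u2, 3))
```

The result lies within about one percent of the values quoted in (79), which suggests that the parameters in (76) were chosen so that the steady-state control is approximately $(1, -1)$.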


In terms of the notation of (45) and (47)

$$\hat\theta(k) \triangleq [\,a_{11} \ \ a_{12} \ \ \hat b_{11}(k) \ \ \hat b_{12}(k) \ \ a_{21} \ \ a_{22} \ \ \hat b_{21}(k) \ \ \hat b_{22}(k)\,]' \qquad (80)$$

and

$$H(k) = \begin{bmatrix} -y_1(k) & -y_2(k) & u_1(k) & u_2(k) & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & -y_1(k) & -y_2(k) & u_1(k) & u_2(k) \end{bmatrix} \qquad (81)$$


The controllers are implemented with a sliding
horizon for a total of 40 time steps. The
evaluation criterion is

$$C_k = (y(k+1) - y_r)'\,Q(k)\,(y(k+1) - y_r) \qquad (82)$$


Analysis of the Monte Carlo Average Costs

Comparisons are made between the performances of
the cautious and the dual algorithm on the system, and
a statistical significance analysis is done using the
normal theory approach (i.e., it is assumed that the
central limit theorem holds for the sample mean from
a large number of runs) [M3]. Tables 7-10 contain
the results of the simulation runs. Table 7 compares
the average cost $C_k$ over 500 Monte Carlo runs
for the first 40 time steps for the HCE, cautious and
dual algorithms, with a control limiter $|u_i| \le 2$,
$i = 1, 2$.

The cumulative average cost over the 40 time steps
is lowest for the dual controller. The HCE
incurs an excessive penalty in time step 1 because of
lack of caution. The cautious controller is overly
cautious and exhibits slow convergence. However, the
dual controller incurs less penalty in step 1 than the
HCE and makes a judicious choice of caution and
probing to learn the parameters fast. Figure 8
compares the performances of the three algorithms for
500 Monte Carlo runs.

Table 8 provides a statistical significance test
and shows the improved performance of the dual
solution from time step 4 onwards with at least 98%
confidence.

Table 9 indicates the percentage of runs in which
the cost exceeds 2000 for the two algorithms. This
threshold of 2000 is selected from a sample
distribution study of the cost at each time step.
Table 10 shows the percentile test [M3, N1] comparing
the cautious and the dual solution. They clearly
indicate, from time step 4 onwards, the light tailed
nature of the distribution of the cost yielded by the
new dual control algorithm.

Individual Time History Runs

Analysis of the Monte Carlo average cost indicates
the improvement offered by the dual solution, but it
provides no information about the cautious control's
turn-off phenomenon [S1, W1]. Hence a careful
investigation of the individual runs is required to
examine these occurrences.

The turn-off phenomenon is observed in many runs
among the 500 Monte Carlo simulations while using the
cautious controller; run 90 is a typical example.
Both control components are almost off between time
steps 0 and 20, during which the dual controller has
already identified the parameters and reached the
desired trajectory. Figures 9-12 portray this result.



                    HCE               Cautious               Dual
Time Step k     C_k     Cum.       C_k     Cum.       C_k     Cum.

     1         14851   14851       3623    3623       6944    6944
     2          6241   21092       3961    7584       6722   13666
     3          3578   24670       3246   10830       4230   17896
     4          1616   26286       2836   13666       1866   19762
     5          1354   27640       2505   16171       1492   21254
     6           807   28447       2154   18325        953   22207
     7           593   29040       1921   20246        700   22907
     8           462   29502       1670   21916        582   23489
     9           397   29899       1623   23539        535   24024
    10           347   30246       1327   24866        385   24409
    40            77   34444        281   43810         89   29178

(Cum. denotes the cumulative cost $\sum_{i=1}^{k} C_i$.)

Table 7. Average costs for the three algorithms in
the dynamic model with a limiter (500 Monte
Carlo runs; $|u_1| \le 2.0$, $|u_2| \le 2.0$)
Time Step k    Test Statistic z_k    Estimated Improvement EI_k (%)

     1               -8.1                       -91
     2               -5.3                       -69
     3               -2.2                       -30
     4                3.5                        34
     5                3.3                        40
     6                6.0                        56
     7                6.3                        64
     8                6.5                        65
     9                6.5                        67
    10                5.7                        71
    11                6.3                        76
    12                5.6                        70
    13                5.9                        82
    14                5.2                        62
    15                5.5                        79
    16                4.9                        70
    17                4.5                        78
    18                4.4                        74
    19                4.4                        76
    20                4.3                        76

Table 8. Statistical significance test for
comparison of cautious and the dual
algorithm in the dynamic model with a
limiter (500 Monte Carlo runs;
$|u_1| \le 2.0$, $|u_2| \le 2.0$)



Time Step k    Test statistic at K_90

     1                 --
     2                 --
     3                 --
     4                 10
     5                 19
     6                 23
     7                 32
     8                 35
     9                 57
    10                 37
    11                 40
    12                 40
    13                 40
    14                 16
    15                 32
    16                 11
    17                 16
    18                 16
    19                 18
    20                 25

Table 10. Percentile test for comparison of
cautious and the dual algorithm in the
dynamic model with a limiter (500 Monte
Carlo runs; $|u_1| \le 2.0$, $|u_2| \le 2.0$)



[Plot: average cost (x 1000) versus time step (0 to 40) for the cautious, dual and HCE controllers]

Fig. 8 Time history of the average cost using the
heuristic certainty equivalence, cautious
and the dual controllers. (500 Monte Carlo
runs; $|u_1| \le 2.0$, $|u_2| \le 2.0$)
[Plot: control 1 versus time step for the cautious and dual controllers]

Fig. 11 Time history of control 1 using the cautious
and the dual algorithms for run 90 (500 Monte
Carlo runs; $|u_1| \le 2.0$, $|u_2| \le 2.0$)
3.5 CONCLUSIONS

A new adaptive dual control solution has been
developed for an ARMA MIMO system. This solution
utilizes the dual effect by performing a second order
Taylor series expansion of the expected future cost.
It modifies the cautious solution by numerator and
denominator correction terms. Analysis of the
simulation runs has shown that this new dual control
solution applied to a multi-input multi-output model
improves over the cautious controller. The key
improvement is the avoidance of turn-off and slow
convergence, which are typical of the cautious
solution.



APPENDIX A 


Statistical Significance in the 
Comparison of Controller Performance 

Two control algorithms are compared by performing
a Monte Carlo simulation. $S$ independent runs with
the two algorithms, under the same homogeneous
conditions, yield a set of i.i.d. samples

$$\{C_{ik}^{(1)},\ C_{ik}^{(2)};\ i = 1, 2, \ldots, S\}$$

from two distributions with true but unknown means
$J_k^{(1)}$ and $J_k^{(2)}$, respectively, for each
time step $k$.

The sample means

$$\bar C_k^{(j)} = \frac{1}{S}\sum_{i=1}^{S} C_{ik}^{(j)}, \qquad j = 1, 2 \qquad (A.1)$$

are point estimates of the respective true means.
A statement that

$$\bar C_k^{(1)} < \bar C_k^{(2)} \qquad (A.2)$$

indicating that algorithm 1 is better than 2 for time
step $k$ has to be accompanied by a level of
significance $\alpha$ of type I error.

Thus we test the hypothesis

$$H_0: \Delta_k = J_k^{(2)} - J_k^{(1)} \le 0 \quad \text{(algorithm 1 not better)} \qquad (A.3)$$

against the one sided alternative

$$H_1: \Delta_k = J_k^{(2)} - J_k^{(1)} > 0 \quad \text{(algorithm 1 better)} \qquad (A.4)$$

for a particular $\alpha$ level at each time step $k$.

This probability of error $\alpha$ is defined as

$$\alpha = P\{\text{accept } H_1 \mid H_0 \text{ true}\} \qquad (A.5)$$

Since we get a set of data of the performances of
the two algorithms on the plant under similar
conditions we regard it as a set of naturally paired
observations.

We consider the sample differences

$$\Delta_{ik} = C_{ik}^{(2)} - C_{ik}^{(1)} \qquad (A.6)$$
and this set of differences $\Delta_{ik}$ represents a
sample with mean

$$\Delta_k = J_k^{(2)} - J_k^{(1)} \qquad (A.7)$$

Thus we have reduced the two-sample problem to a
one-sample problem. The hypothesis is tested by
examining whether $\Delta_k$ can be accepted as being
positive with high confidence. The test statistic is

$$z_k = \frac{\bar\Delta_k}{\sigma_{\bar\Delta_k}} \qquad (A.8)$$

where

$$\bar\Delta_k = \frac{1}{S}\sum_{i=1}^{S}\Delta_{ik} \qquad (A.9)$$

$$\sigma^2_{\bar\Delta_k} = \frac{1}{S(S-1)}\sum_{i=1}^{S}(\Delta_{ik} - \bar\Delta_k)^2 \qquad (A.10)$$

The test statistic $z_k$ has a t-distribution
with $(S-1)$ degrees of freedom. For $S$ large ($> 50$),
$z_k$ has approximately a normal distribution. Then
we have

$$\sigma^2_{\bar\Delta_k} \approx \frac{1}{S^2}\sum_{i=1}^{S}(\Delta_{ik} - \bar\Delta_k)^2 \qquad (A.11)$$

and the hypothesis $H_1$ is accepted if

$$z_k > \gamma \qquad (A.12)$$

where $\gamma$ is taken from the normal distribution
tables. For a one-sided test with $\alpha = 0.05$, one has
$\gamma = 1.645$.
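The paired test of (A.6)-(A.12) amounts to only a few lines of code. The sketch below uses the variance form (A.10); the function name and the five made-up cost pairs are illustrative only (far fewer than the $S > 50$ runs for which the normal threshold is justified) and serve just to exercise the arithmetic.

```python
import math


def paired_z_statistic(costs_alg1, costs_alg2):
    """Test statistic (A.8) from paired per-run costs at one time step."""
    S = len(costs_alg1)
    diffs = [c2 - c1 for c1, c2 in zip(costs_alg1, costs_alg2)]    # (A.6)
    mean_diff = sum(diffs) / S                                     # (A.9)
    var_of_mean = sum((d - mean_diff) ** 2 for d in diffs) / (S * (S - 1))  # (A.10)
    return mean_diff / math.sqrt(var_of_mean)


GAMMA_05 = 1.645  # one-sided normal threshold for alpha = 0.05, as in (A.12)

# Made-up costs for five runs, only to exercise the function.
alg1 = [10.0, 10.0, 10.0, 10.0, 10.0]
alg2 = [11.0, 12.0, 13.0, 12.0, 12.0]
z = paired_z_statistic(alg1, alg2)
algorithm_1_better = z > GAMMA_05
```

Pairing the runs removes the run-to-run variation that is common to both algorithms, so the test is more sensitive than comparing the two sample means as if they were independent.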

The estimated improvement for each time step $k$
is defined as

$$EI_k \triangleq \frac{\bar C_k^{(2)} - \bar C_k^{(1)}}{\bar C_k^{(2)}} \times 100\% \qquad (A.13)$$
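As a consistency check, (A.13) can be applied to the average costs of Table 7, taking the dual controller as algorithm 1 and the cautious controller as algorithm 2, and compared with the estimated improvements listed in Table 8. The function name is illustrative; the numbers are copied from the two tables.

```python
def estimated_improvement(mean_cost_alg1, mean_cost_alg2):
    """Estimated improvement (A.13) of algorithm 1 over algorithm 2, in percent."""
    return (mean_cost_alg2 - mean_cost_alg1) / mean_cost_alg2 * 100.0


# Average costs C_k for time steps 1..10 as listed in Table 7.
dual = [6944, 6722, 4230, 1866, 1492, 953, 700, 582, 535, 385]
cautious = [3623, 3961, 3246, 2836, 2505, 2154, 1921, 1670, 1623, 1327]
# EI_k (%) for the same time steps as listed in Table 8.
table8 = [-91, -69, -30, 34, 40, 56, 64, 65, 67, 71]

ei = [estimated_improvement(d, c) for d, c in zip(dual, cautious)]
```

The computed values agree with Table 8 to within one percentage point at every step, including the negative improvements at steps 1 to 3 where the dual controller is still probing.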

For our problem the costs have a probability
density function which is not symmetric and also not
normal. For this class of problems nonparametric
tests for two samples are applicable [N1]. A
percentile test is recommended here to further
substantiate that the vibrations obtained by using the
dual and the cautious algorithms come from two
different distributions and that the tail of the
distribution obtained from the dual is lighter than
the tail of that obtained by the cautious algorithm.
This test is described next. The two samples are
pooled together and a 90 percentile point denoted as
$K_{90}$ is chosen. Then a 2x2 contingency table is
computed as follows

              <= K_90     > K_90     Totals

Dual             a           b        a + b
Cautious         c           d        c + d
Totals         a + c       b + d      n = a + b + c + d

where a, b, c, d are the observed frequencies
for the four cells.

The $\chi^2$ (chi-square) value is obtained by

$$\chi^2 = \frac{n(ad - bc)^2}{(a+b)(c+d)(b+d)(a+c)} \quad \text{at 1 degree of freedom} \qquad (A.14)$$

This should be greater than 3.8 at $\alpha = 0.05$ to
conclude that the tail of the dual is lighter than that
of the cautious.
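The percentile test can be sketched as below. The chi-square expression is the standard one for a 2x2 contingency table without continuity correction, which is assumed here to be the form of (A.14). The rule used to pick $K_{90}$ from the pooled sample is a simple order-statistic convention chosen for illustration, and the function names and sample values are hypothetical.

```python
def chi_square_2x2(a, b, c, d):
    """Chi-square statistic for a 2x2 table (no continuity correction)."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (b + d) * (a + c))


def percentile_test(dual_costs, cautious_costs, percentile=90):
    """Build the 2x2 table at the pooled percentile point and return chi-square."""
    pooled = sorted(dual_costs + cautious_costs)
    # Pooled percentile point K_90 (order-statistic convention, an assumption).
    k_point = pooled[(percentile * len(pooled)) // 100 - 1]
    a = sum(1 for x in dual_costs if x <= k_point)
    b = len(dual_costs) - a
    c = sum(1 for x in cautious_costs if x <= k_point)
    d = len(cautious_costs) - c
    return chi_square_2x2(a, b, c, d)


# Hypothetical cell counts: 500 runs per algorithm, 100 pooled runs above K_90.
chi2_counts = chi_square_2x2(480, 20, 420, 80)

# Small made-up samples, only to exercise the table construction.
chi2_samples = percentile_test([float(v) for v in range(1, 11)],
                               [float(v) for v in range(6, 16)])
```

With the hypothetical counts the statistic is well above the 3.8 threshold, whereas the small made-up samples fall below it, which shows how the test depends on how unevenly the upper tail is split between the two algorithms.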
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